Apache Iceberg
8 items using Apache Iceberg
Projects
Blog Posts
Iceberg Series, Part 6: Multi-Engine & Maintenance
Querying Iceberg from Trino, Flink, and DuckDB; expiring snapshots; rewriting data files; and keeping Iceberg tables healthy in production.
Iceberg Series, Part 5: Row-Level Operations
How MERGE, UPDATE, and DELETE work in Iceberg — copy-on-write vs merge-on-read, when to use each, and the performance trade-offs.
Iceberg Series, Part 4: Hidden Partitioning & Evolution
Partition transforms that derive partition values automatically, partition evolution that changes strategy without rewriting data, and why these are Iceberg's biggest ergonomic wins.
Iceberg Series, Part 3: Catalogs
How Hive, Glue, REST, and Nessie catalogs coordinate multi-engine access to Iceberg tables — and why the catalog abstraction is Iceberg's biggest differentiator.
Iceberg Series, Part 2: Table Format Internals
The four-layer metadata hierarchy — table metadata, manifest lists, manifest files, and data files — and how it enables efficient scans and snapshot isolation.
Iceberg Series, Part 1: Getting Started
Creating Iceberg tables with Spark, reading and writing data, running MERGE, using time travel, and inspecting table history.
Iceberg Series, Part 0: Overview
What Apache Iceberg is, how it differs from Delta Lake and Hudi, and why multi-engine interoperability is its defining advantage.